Compare verbal and visual

We regressed compliance on social conformity, approval of the rules, and trust in science. There was a significant effect of conformity (\(b=0.4\), \(t=34\), \(p<.01\)) and of approval (\(b=0.27\), \(t=22\), \(p<.01\)) but not of trust in science (\(b=0\), \(t=0.8\), \(p=.48\)).

It’s not just verbal vs. visual: how much visual?

Grammar of graphics

  • Be able to build up a plot in layers

  • Flexible in how you map data to graphical elements

  • Can be tricky at the beginning, but gets easier

  • Remember: rely on googling error messages + trying stuff
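A minimal sketch of building a plot in layers. It uses R's built-in mtcars dataset, which is an assumption made for illustration; the slides do not name a dataset here. Each `+` adds one more piece on top of the previous plot object.

```r
library(ggplot2)

# Start with data and aesthetics only: this draws empty axes
base <- ggplot(mtcars, aes(x = wt, y = mpg))

# Layer 1: the raw data as points
with_points <- base + geom_point()

# Layer 2: a straight-line fit drawn on top of the points
with_trend <- with_points + geom_smooth(method = "lm", formula = y ~ x)

# Labels can be added last without touching the layers
final_plot <- with_trend + labs(x = "Weight (1000 lbs)", y = "Miles per gallon")
final_plot
```

Printing any of the intermediate objects (base, with_points, with_trend) shows the plot at that stage, which helps when tracking down an error.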

Grammar of graphics: Aesthetics

  • Data \(\longrightarrow\) columns
  • Plot \(\longrightarrow\) elements:
    • x, y axes
    • color, fill
    • size, shape, linetype, …
  • An aesthetic, set with aes(), is just a mapping from a data column to a plot element
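As a sketch of the mapping idea, again assuming the built-in mtcars dataset (not part of the original slides): each argument of aes() pairs one plot element with one data column.

```r
library(ggplot2)

# Each argument of aes() pairs a plot element (left) with a data column (right)
p <- ggplot(mtcars,
            aes(x = wt,              # x axis <- weight
                y = mpg,             # y axis <- fuel economy
                color = factor(cyl), # color  <- number of cylinders
                size = hp)) +        # size   <- horsepower
  geom_point()
p
```

Wrapping cyl in factor() makes ggplot2 treat it as a set of categories, so each cylinder count gets its own distinct color rather than a shade on a continuous gradient.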

Grammar of graphics: Geoms

  • How to picture each aesthetic
    • Bar
    • Point
    • Line
    • Density curve/violin/etc.

Example: 1 aesthetic, 3 geoms

library(ggplot2)
# 80 simulated scores: normal with mean 5 and sd 4
data <- data.frame(score=rnorm(80, 5, 4))
myplot <- ggplot(data, aes(x=score))

myplot + geom_density()

myplot + geom_histogram()

myplot + geom_dotplot()
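With no arguments, geom_histogram() falls back to 30 bins and geom_dotplot() to a bin width of 1/30 of the data range, and both print a message suggesting you pick a better value. A small sketch of setting binwidth yourself; the specific widths below are illustrative choices, not recommendations from the slides.

```r
library(ggplot2)

set.seed(1)  # make the simulated scores reproducible
data <- data.frame(score = rnorm(80, 5, 4))
myplot <- ggplot(data, aes(x = score))

hist_wide   <- myplot + geom_histogram(binwidth = 2)    # bins 2 score units wide
hist_narrow <- myplot + geom_histogram(binwidth = 0.5)  # finer bins, noisier shape
dots        <- myplot + geom_dotplot(binwidth = 0.5)    # one dot per observation

hist_wide
```

Trying two or three widths is usually enough to see whether a bump in the distribution is real or an artifact of the binning.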

Points: aesthetics vs style

ggplot(data_scatter,
       aes(x=x,
           y=outcome)) + 
  geom_point()

ggplot(data_scatter,
       aes(x=x,
           y=outcome)) + 
  geom_point(alpha=0.4,
             color="royalblue")

ggplot(weight_data,
       aes(x=height_cm,
           y=weight_kg,
           color=gender)) + 
  geom_point(alpha=0.4)
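The difference between style and aesthetic can be sketched side by side. The weight_data below is a simulated stand-in (an assumption; the real dataset is not shown in the slides), with made-up means and spreads.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for weight_data: 100 people, two gender labels
weight_data <- data.frame(
  gender    = rep(c("female", "male"), each = 50),
  height_cm = c(rnorm(50, 165, 7), rnorm(50, 178, 7))
)
weight_data$weight_kg <- 0.9 * weight_data$height_cm - 85 + rnorm(100, 0, 8)

# Style: a constant set outside aes(); every point gets the same color
p_style <- ggplot(weight_data, aes(x = height_cm, y = weight_kg)) +
  geom_point(alpha = 0.4, color = "royalblue")

# Aesthetic: a column mapped inside aes(); color varies with the data
p_mapped <- ggplot(weight_data, aes(x = height_cm, y = weight_kg, color = gender)) +
  geom_point(alpha = 0.4)

p_mapped
```

A common slip is writing aes(color="royalblue"): that maps color to a one-value column called "royalblue", so the points come out in the default palette color with a legend entry, not in royal blue.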

Line plots

ggplot(data_time, 
       aes(x=time,
           y=value,
           color=stock)) + 
  geom_line()
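For this to work, the data needs one row per stock per time point (long format). A sketch with a simulated stand-in for data_time; the column names follow the slide, but the values are made-up random walks.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for data_time: one row per stock per time point
data_time <- data.frame(
  time  = rep(1:50, times = 2),
  stock = rep(c("A", "B"), each = 50),
  value = c(100 + cumsum(rnorm(50)), 100 + cumsum(rnorm(50)))
)

# color = stock also groups the rows, so each stock gets its own line
p_lines <- ggplot(data_time, aes(x = time, y = value, color = stock)) +
  geom_line()
p_lines
```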

Stats

stat_summary(fun=…) - using the same data, generate some summaries

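stat_smooth(method=…) - fit a smoother to the data and draw the fitted trend line with its confidence band (a flexible curve by default, a straight line with method=lm)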

Stat_summary

ggplot(weight_data, 
       aes(x=gender, y=weight_kg)) + 
  geom_point(alpha=0.4) + 
  stat_summary(fun = "mean",
               color = "red",
               geom = "point",
               shape = 18,
               size = 4)
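To see what stat_summary computes, the sketch below compares its output with per-group means calculated by hand. The weight_data here is simulated (an assumption, with made-up means), and layer_data() is used only to peek at the values ggplot2 computed for the second layer.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for weight_data
weight_data <- data.frame(
  gender    = rep(c("female", "male"), each = 50),
  weight_kg = c(rnorm(50, 65, 10), rnorm(50, 80, 10))
)

p <- ggplot(weight_data, aes(x = gender, y = weight_kg)) +
  geom_point(alpha = 0.4) +
  stat_summary(fun = "mean", color = "red", geom = "point", shape = 18, size = 4)

# The red diamonds sit at the per-group means
group_means   <- tapply(weight_data$weight_kg, weight_data$gender, mean)
summary_layer <- layer_data(p, 2)  # data ggplot2 computed for the second layer

group_means
summary_layer$y
```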

If you’re ever having trouble with stat_summary and geom="line" (lines look crazy, lots of up/down lines): add aes(group=…) so ggplot knows which points belong on the same line
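One common case, sketched with hypothetical data: when x is a factor, ggplot2 treats each level as its own group, so a line layer has nothing to connect. Setting group = 1 puts all the means in a single group.

```r
library(ggplot2)

set.seed(1)
# Hypothetical data: 20 scores in each of three conditions
data_cond <- data.frame(
  condition = factor(rep(c("low", "mid", "high"), each = 20),
                     levels = c("low", "mid", "high")),
  score     = c(rnorm(20, 4), rnorm(20, 5), rnorm(20, 7))
)

# group = 1 puts all condition means in one group, so one line connects them
p_line <- ggplot(data_cond, aes(x = condition, y = score)) +
  stat_summary(fun = "mean", geom = "point") +
  stat_summary(fun = "mean", geom = "line", aes(group = 1))
p_line
```

To get one line per participant or per treatment instead, map group to that column rather than to the constant 1.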

Stat_smooth

ggplot(full_weight_data,
       aes(x=height_cm,y=weight_kg)) + 
  geom_point(alpha=0.4)+
  stat_smooth()

ggplot(full_weight_data,
       aes(x=height_cm,y=weight_kg)) + 
  geom_point(alpha=0.4)+
  stat_smooth(method="lm")  # straight-line fit instead of the default smoother
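With no method given, stat_smooth() picks loess for fewer than 1,000 observations and a GAM otherwise; with method = "lm" it draws the ordinary regression line. The sketch below checks that against a direct lm() fit, using a simulated stand-in for full_weight_data (an assumption; the slope and noise are made up).

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for full_weight_data
full_weight_data <- data.frame(height_cm = rnorm(200, 170, 10))
full_weight_data$weight_kg <-
  0.9 * full_weight_data$height_cm - 85 + rnorm(200, 0, 8)

p <- ggplot(full_weight_data, aes(x = height_cm, y = weight_kg)) +
  geom_point(alpha = 0.4) +
  stat_smooth(method = "lm", formula = y ~ x)

# The same straight line, fitted directly
fit          <- lm(weight_kg ~ height_cm, data = full_weight_data)
smooth_layer <- layer_data(p, 2)  # x, y, ymin, ymax computed by stat_smooth
direct       <- predict(fit, newdata = data.frame(height_cm = smooth_layer$x))

p
```

The grey band around the line is the confidence interval for the fitted mean; se = FALSE turns it off.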

Labels

It’s very tempting to just use the default column names

But it is definitely worth adding more explicit labels:

  • labs(x=…, y=…, color=…, title=…)
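A short sketch of labs() in use, assuming the built-in mtcars dataset for illustration. The label text is made up; the point is that units and plain-language names replace raw column names such as wt and factor(cyl).

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lbs)",
       y = "Fuel economy (miles per gallon)",
       color = "Cylinders",
       title = "Heavier cars use more fuel")
p
```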

Marginal plots

Nice packages

Exercise

  • Using weight_data, create a plot of weight ~ gender

  • Use all of the following geoms/stats:

    • geom_violin
    • geom_point with position=position_jitter()
    • stat_summary (for the mean)
  • Make nice labels

  • If you have enough time:

    • use geom_errorbar() to add confidence intervals
      • hint: you can have different datasets for different geoms
      • use the bootMeans function I’ve provided below
    • add geom_segment() and annotate() for significance bars/stars